final draft
- Research Report > Promising Solution (0.41)
- Overview > Innovation (0.41)
DualDICE continues to provide more accurate and stable results compared to the baselines, especially in continuous-3
We thank the reviewers for their close reading of the paper and helpful feedback. We are also excited to apply ideas from DualDICE to the policy improvement problem, as mentioned by the reviewers. We are exploring several potential approaches to this problem. How are these assumptions handled practically, e.g. What are the x-axes in figure 2 and figure 4? We will remedy this in the final draft.
Regularization
Finding an appropriate value for hyper-parameters is always a challenge in machine learning problems. We do not yet compare to prior tractography algorithms. The availability of a ground truth Phi lets us focus our investigation into the soundness With the additional space in the camera-ready, we can include more background and discussion on ENCODE. Our solution exploits the inherent sparseness in the optimization. However, this does not need to be run for each step of the optimization and is not expensive to do once upfront.
We sincerely thank all reviewers for the insightful comments and feedback on our work of learning from failure (LfF)
We sincerely thank all reviewers for the insightful comments and feedback on our work of learning from failure (LfF). We do not interpret this as a "true" trade-off, as debiasing does not degrade the model's Instead, we view the apparent underperformance as a result of "not utilizing a (delusional) spurious correlation." Following R1's suggestion, we additionally test ReBias [2] (SOTA among This is also consistent with our claim that LfF is not "domain-specific" However, this consistency may not hold depending on the definition of "domain." Hence, we deeply resonate with R2's concern, and we will further clarify the type of knowledge used by LfF and For example, we will modify L2-5 in the abstract by "In this work, we propose a new algorithm utilizing a However, we only use the LfF's yes/no type of knowledge for choosing one of the attributes as an undesired Following R2's suggestion, we further verify Our LfF combination rule achieves 74.01% We will add more discussions and experiments in the final draft.
We sincerely appreciate insightful comments and positive feedback from the reviewers: important problem (R1
We respond to each comment one by one. We mention this in Line 148; however, we will make it clear in the final draft. Conversely, SSL algorithms use the unlabeled data but they do not consider the class imbalance. We will make this point clear in the final draft. However, to avoid confusion, we will replace X, Y with α, β in the final draft.